perm filename NLM[AM,DBL] blob
sn#576666 filedate 1981-04-06 generic text, type C, neo UTF8
PROJECT 2: WORKBENCH FOR KNOWLEDGE REPRESENTATION (summary)
Investigator: Douglas B. Lenat
1. Objectives
This project's major objective is the application of artificial
intelligence (AI) techniques to various specific tasks (primarily
medical). To achieve this goal, several software packages are being
developed, each of which has more or less general applicability. Some are
still experimental vehicles (RLL, MRS), but most have already seen several
applications (EMYCIN, AGE, UNITS). Together, these packages comprise what
we term a Knowledge Engineer's Workbench; they are tools which facilitate
the rapid construction, testing, modification, and explanation (access) of
knowledge-based programs.
The theme this year has been the facilitating of direct use of these
systems by experts from medical and other disciplines. EMYCIN has
produced a new users' manual, and several new applications have been
developed directly by medical specialists. AGE has improved its debugging
and explanation facilities. UNITS, responding to user demand, has been
speeded up by an order of magnitude, and a new rules language has been
implemented.
2. Studies and Results
Three new applications of EMYCIN were pursued this year. The first, a
medical consultant program, tracks an expectant mother through her
pregnancy, recommending tests, detecting potentially dangerous medical
conditions, and estimating the current age of gestation. A second
consultant identifies probable causes of failures in teleprocessing
subsystems of IBM 370-class computer systems. The third major consultant
now under development seeks to identify rock formations found at various
depths of an oil-well bore hole. Thanks to the improved knowledge
acquisition capabilities of EMYCIN, and the availability of an EMYCIN
manual, each of these three consultants was constructed largely by the
experts themselves. After initial discussions concerned with the design
of the system's goals, the identification of the data to be gathered, and
the basic flow of the consultant's dialogue, the process of writing and
inputting the hundreds of rules and parameters per system has been done
primarily by the expert. All of them have remarked on the ease with which
the current facilities allow this interaction to occur.
The UNITS package has been made into a more comfortable one for users.
This has involved engineering the system to be as bug-free as humanly
possible and improving speed and space in the system by orders of
magnitude. Most of the improvements have come from noting the places in
the system where needless "generality" slowed down the system. Secondly,
a natural "rules language" has been developed, one which easily admits the
description of procedural expertise. In effect, computer-naive experts can
now write simple "programs" in limited contexts during the rule-writing
process.
Following feedback at an AGE workshop, that system's debugging and
traceback explanation facilities were expanded. AGE versions of LITHO and
VM have been created, and a new, complex application problem has been
started, to explore the merits and drawbacks of the currently implemented
AGE.
The research on RLL and MRS began with the goal of giving a system
knowledge of itself and the ability to control its operation and modify
its own structure. With a few simple statements, it is possible to select
a new data structure (e.g. property lists) or enable a different inference
algorithm (e.g. a general procedure like property inheritance or backward
chaining or a domain specific procedural attachment). One goal is to keep
MRS simultaneously maintained in Maclisp, Interlisp, Franz Lisp, and
Lisp370. MRS was used in the DART project (mentioned above) and the
Intelligent Agents project (smart interface for computer operating
systems). The system has been exported for trial use by the
Hewlett-Packard Co. and the Rand Corp. RLL is a program which "knows"
about the components of representation languages in general. It provides
the user with an extendable collection of high level operators and
constructs which he can use to describe and build components of his target
language. After the user has specified the desirable features of the
target language, RLL integrates these components into a functional, new
representation language. In other words, a user can readily and rapidly
design a personalized language, exactly suited to the domain and the
application task at hand.
3. Goals for the Coming Year
In the next year, we expect to increase the number and complexity of
applications of our tools, thereby providing more feedback to aid us in
improving them and designing new and better tools. We expect most of these
applications will be built almost exclusively by the interested experts
themselves, with less recourse to programmers and system designers than in
previous years. Some of our more experimental tools, such as MRS and RLL,
will be getting their first real trials this year.
AGE is planning, and has already begun, a large-scale test on a complex
application. A multi-facet display capability is also being designed. MRS
will see the construction of useful new "plug-in" modules and a few
extensions to the inference capabilities. The issues of primary concern
at this point include an implementation of default reasoning and reasoning
with equality. Both MRS and RLL are expanding their capabilities to
reason about their own structure and behavior. RLL is being used to build
the Eurisko system, a program which gathers its own empirical data, and
tries to synthesize new heuristic rules. Each project is also continuing
this year's theme, the increasing facility with which an outside expert
can sit down and construct a knowledge-based expert system.
PROJECT 2: WORKBENCH FOR KNOWLEDGE REPRESENTATION (detailed report)
Investigator: Douglas B. Lenat
1. Objectives
This project's major objective is the application of artificial
intelligence (AI) techniques to various specific tasks (primarily
medical). To achieve this goal, several software packages are being
developed, each of which has more or less general applicability.
Together, these packages comprise what we term a Knowledge Engineer's
Workbench; they are tools which facilitate the rapid construction,
testing, modification, and explanation (access) of knowledge-based
programs.
The theme this year has been the facilitating of direct use of these
systems by experts from medical and other disciplines. EMYCIN has
produced a new users' manual, and several new applications have been
developed directly by medical specialists, without the need to interact
closely with the system designers and maintainers. In GRAVIDA, a program
for tracking the progress of pregnancies and suggesting tests when
appropriate, Dr. V. Catanzarite entered all 300 rules by himself. The
LITHO system, which predicts the lithofacies (types of rocks) in a well,
had about 250 rules, also entered directly by the expert, Dr. Jacques
Harry. Using DART, our expert (S. Snyder) entered 190 rules to produce a
model-based diagnoser of teleprocessing faults.
The objective is being reached: experts are now sitting down and doing it
themselves. The theme was echoed in the work this year on AGE. Following
a productive workshop last winter, work has focused on improving the
debugging and traceback facilities. In UNITS, responding to user demand,
the system has been speeded up by an order of magnitude, and a new rules
language has been implemented.
This year, as last year, we have applied our existing packages (AGE,
EMYCIN, UNITS) to new medical tasks. Tools still under development
(CORLL, RLL, MRS) have been tentatively applied to various scientific
tasks, though already we see the tool-like nature of these packages: CORLL
was used to build both RLL and MRS.
2. Studies and Results
2.1 Direct Construction of EMYCIN Consultants by Experts
Three new applications of EMYCIN were pursued this year. The first, a
medical consultant called GRAVIDA, was developed to track an expectant
mother through her pregnancy. Constructed by Dr. V. Catanzarite, currently
a resident at Santa Clara Valley Hospital, the system acquires information
about current and past medical problems of the mother, any previous
pregnancies, and general historical data about the patient. GRAVIDA then
keeps track of the patient on a per-visit basis, recommending tests,
detecting potentially dangerous medical conditions, and estimating the
current age of gestation. The construction of this consultant required the
extension of the rule language (by adding several new predicate functions)
to look for simple trends and events over a series of previous visits.
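As an illustration only, a trend-detecting predicate of the kind described above might look like the following Python sketch. EMYCIN itself is a Lisp system, and the report does not list the predicates that were added; the function name rising_trend, its parameters, and the per-visit readings are hypothetical and are not taken from GRAVIDA.

```python
from typing import Sequence

def rising_trend(values: Sequence[float], min_visits: int = 3,
                 min_rise: float = 0.0) -> bool:
    """True if the last `min_visits` readings are strictly increasing and
    the total rise over that window exceeds `min_rise`."""
    if len(values) < min_visits:
        return False
    window = values[-min_visits:]
    increasing = all(a < b for a, b in zip(window, window[1:]))
    return increasing and (window[-1] - window[0]) > min_rise

# Hypothetical readings recorded at successive visits, oldest first.
readings_by_visit = [118, 117, 122, 131, 140]
flag = rising_trend(readings_by_visit, min_visits=3, min_rise=10)
```

A rule premise could then test such a predicate in the same way it tests any other condition on the patient record.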
The other consultants are applied to non-medical domains. In conjunction
with the IBM Corporation we have developed a consultant, called DART, that
identifies probable causes of failures in teleprocessing subsystems of IBM
370-class computer systems. The system accepts stylized descriptions of
the observed failure (e.g., lost data, machine went into a loop, terminal
doesn't respond, etc.) and then directs the acquisition of data which are
collected from traces available to field service personnel. Finally, DART
uses this data to indict specific components, both hardware and software,
which might be broken.
The other major consultant now under development seeks to identify rock
formations found at various depths of an oil-well bore hole. The
consultant, called LITHO, examines geological and physical data of
individual zones of interest to identify various aspects of the geological
formations. This consultant is being constructed in conjunction with the
Schlumberger Corporation and is similar to the GEO consultant developed
with the AGE system. [qv]
With the publication of a thesis on the EMYCIN system [vanMelle80a],
dealing with the design of improved knowledge acquisition facilities for
EMYCIN, and the availability of an EMYCIN manual [vanMelle80b], each of
these three consultants was constructed largely by the experts
themselves. After initial discussions concerned with the design of the
system's goals, the identification of the data to be gathered, and the
basic flow of the consultant's dialogue, the process of writing and
inputting the hundreds of rules and parameters per system has been done
primarily by the expert. All of them have remarked on the ease with which
the current facilities allow this interaction to occur. As a result of
these experiments numerous improvements and modifications, both to the
EMYCIN system and to the manual, have been incorporated into the package.
2.2 Facilitating the Use of UNITS
Last year, we described in detail the Units system, a frame-based
knowledge representation package, developed in the context of experiment
design in molecular genetics. New molecular biology applications of Units
have been developed, but for this section we shall concentrate on two
major areas in which the Units package itself has progressed as a tool for
building such systems.
The first major push has been to make the package a comfortable one for
users. This has involved engineering the system to be as bug-free as
humanly possible and improving speed and space in the system. Over the
past twelve months, the package has speeded up by at least an order of
magnitude (sometimes far more than that) in most important areas. We have
also developed ways to increase user storage available (but we are still
greatly hampered by address space limitations). We have had the chance to
observe over a dozen user groups and have learned from them. Most of our
improvements have come from noting the places in the system where needless
"generality" slowed down the system. There were dozens of places where a
feature that may have been used once in a thousand times (like a TRANSFER
message on a slot) slowed the routine operation by more than an order of
magnitude. We have tried to still allow generality, but have left the
choice up to the user. UNITS, along with AGE, MRS, and RLL, is one of the
first second-generation knowledge representation systems. Much of this
effort is motivated by the fact that the majority of our knowledge base
builders are almost completely computer-naive.
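The speed-up strategy described above, keeping rarely used generality off the common path, can be illustrated with a minimal Python sketch. It is not the UNITS implementation (UNITS is a Lisp package). The Unit class, the handlers table, and the forwarding behaviour given to the handler are all assumptions made for illustration, since this report does not say what a TRANSFER message does.

```python
class Unit:
    """A frame-like unit whose slots are plain values unless a handler is attached."""

    def __init__(self, name):
        self.name = name
        self.slots = {}
        self.handlers = {}  # slot name -> callable; present only for rare special slots

    def get(self, slot):
        # Fast path: a unit with no handlers never pays for the general machinery.
        if not self.handlers:
            return self.slots.get(slot)
        handler = self.handlers.get(slot)
        if handler is None:
            return self.slots.get(slot)
        # Slow, general path: a handler intercepts the access.
        return handler(self, slot)

def transfer_to(other):
    """Build a handler that forwards a slot access to another unit (hypothetical)."""
    def handler(unit, slot):
        return other.get(slot)
    return handler

# Illustrative data only.
enzyme = Unit("EcoRI")
enzyme.slots["recognition-site"] = "GAATTC"
alias = Unit("EcoRI-alias")
alias.handlers["recognition-site"] = transfer_to(enzyme)
```

The design choice is that the user opts in to generality one slot at a time, so ordinary accesses cost a single dictionary lookup.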
Our second major push in the last year has been the implementation of a
language for the description of procedural expertise -- a natural "rules
language". This has allowed scientists to custom-design all sorts of
programs, from simple data manipulation and consistency checking routines
to more advanced planning and reasoning systems like a DNA sequencing
adviser. A paper on this subject has been submitted to IJCAI-81, to be
held this coming summer.
2.3 AGE: A Workshop and its Effects
In the AGE system an attempt has been made to isolate inference, control,
and representation techniques from a few previous knowledge-based systems,
and reprogram them for domain independence. AGE is a library of
building-block programs (called "components") combined with an interface
that assists the user in the design and construction of knowledge-based
programs. It is hoped that AGE will speed up the process of building
knowledge-based programs and facilitate the dissemination of AI techniques
by: (1) packaging common AI software tools so that they do not need to be
reprogrammed for every problem; and (2) helping people who are not
knowledge-engineering specialists write knowledge-based programs.
In the past year, the AGE Project was involved in the following four
activities: First, a workshop was held, in which six people were given
hands-on experience learning to use AGE. Second, the debugging and
traceback explanation facilities of AGE have been expanded. Third, we have
embarked upon a new, complex application problem to explore the merits and
drawbacks of the currently implemented AGE. Fourth, a new medical user of
AGE has joined the project.
The objective of holding the AGE Workshop was two-fold: (1) To explore how
to teach potential users of AGE the details of the system; and (2) To
determine if the current documentation was adequate. It was found that
"typical" users of AGE did not possess enough background knowledge in the
art of building knowledge-based programs to use AGE effectively. The
solution to this problem is not obvious. However, we plan in the long run
to provide a more comprehensive tutorial and program design aids within
AGE. In the meantime a new user will require extensive consultation in
the area of problem formulation before he or she will be ready to use AGE.
It was found that the current documentation is adequate for users already
somewhat familiar with AGE. The documents used for the workshop were "Joy
of AGE-ing" and "AGE Reference Manual". However, we found that one of the
best teaching tools is examples. Since the workshop we have begun a
series of documented examples, of which there are now two.
These examples are actually implemented and running programs; each
document consists of a description of the problem, its formulation in terms
of AGE, reasons for the way in which the program was designed, and a
complete program listing.
Debugging and Other Related Facilities: In our programming activities of
the past year, we placed our emphasis on improving the user interface and
on developing debugging facilities. We have implemented run-time trace
and break facilities. We are currently implementing a traceback facility,
a facility whereby the user can obtain an explanation of what happened in
the program after the fact. One option we are adding is the ability of
the system to backchain. Augmentations have been made to the user
interaction functions, the acquisition and editing functions, and the
user's data input functions.
Staff Development of an Application Program: We have found that the
current users of AGE do not take full advantage of the range of
programming possibilities designed into AGE. This is due in large part to
the fact that the users are not familiar enough with various knowledge
engineering techniques, nor familiar enough with AGE. We, therefore, took
on the task of implementing an application problem that was complex enough
to need some of the more advanced features of AGE. At the same time, the
program was designed in such a way as to utilize most of the different AGE
features. From this activity we have found some weak points to be fixed,
and further desirable features to be added, in future versions. Parts of
the application will be put in the Example Series
mentioned earlier.
2.4 MRS - A Modifiable Representation System
Our research in knowledge representation has largely focussed on the
problem of giving a system knowledge of itself and the ability to control
its operation and modify its structure. Two outgrowths of this research
are the Modifiable Representation System MRS and the Representation
Language Language RLL (described below). MRS is a minimal
self-descriptive system with the ability to reason about its own structure
and behavior. RLL is intended to have more extensive knowledge about
knowledge representation and thereby facilitate tailoring the system
for particular applications. MRS is much smaller than RLL and is conceived
of as the core from which the more knowledgeable RLL will eventually be
built. Superficially, MRS differs from RLL in its external language (full
predicate calculus rather than "units") and in the specific capabilities
available (e.g. general backward chaining).
As a knowledge representation system in its own right, MRS is intended for
use by AI researchers in building expert systems. It offers a repertory
of commands for asserting and retrieving information, with varying degrees
of inference. Information is entered in a predicate calculus-like
language of assertions and is stored in either a propositional network or
any of a variety of specialized data representations (e.g. property lists,
alists, bit vectors). The initial system includes a vocabulary of
concepts and facts about logic, sets, mappings, arithmetic, and
procedures.
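The idea of retrieval "with varying degrees of inference" can be pictured with a small Python sketch. This is an assumption-laden illustration, not MRS itself: MRS is a Lisp system with a full predicate-calculus language, whereas the sketch handles only ground (variable-free) facts and rules. The class and method names (KnowledgeBase, assert_fact, assert_rule, lookup, prove) are hypothetical.

```python
class KnowledgeBase:
    """Ground facts plus ground rules of the form: conclusion if all premises."""

    def __init__(self):
        self.facts = set()
        self.rules = []  # list of (conclusion, [premises])

    def assert_fact(self, fact):
        self.facts.add(fact)

    def assert_rule(self, conclusion, premises):
        self.rules.append((conclusion, list(premises)))

    def lookup(self, goal):
        """Retrieval with no inference: is the goal stored explicitly?"""
        return goal in self.facts

    def prove(self, goal, _pending=frozenset()):
        """Retrieval with inference: depth-first backward chaining."""
        if goal in self.facts:
            return True
        if goal in _pending:  # goal is already being attempted: avoid looping
            return False
        pending = _pending | {goal}
        for conclusion, premises in self.rules:
            if conclusion == goal and all(self.prove(p, pending) for p in premises):
                return True
        return False

kb = KnowledgeBase()
kb.assert_fact(("Father", "Mary", "Fred"))
kb.assert_rule(("Parent", "Mary", "Fred"), [("Father", "Mary", "Fred")])
```

Here lookup fails on the Parent statement because it was never stored, while prove succeeds by chaining backward through the rule.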
What makes MRS special among knowledge representation systems is its
ability to deal with itself in the same way that it deals with application
domains (like geology and medicine). In MRS, the basic representation is
described within the representation itself, and the inference techniques
used in reasoning about application domains can be applied to reasoning
about the system. For example, MRS might use facts about sets and
sequences to determine whether a statement is provable, or it might use
facts about directed graphs to determine that backward chaining is the
most appropriate technique to use with forward branching search spaces.
Because of MRS's formalism and meta-level vocabulary, it's possible for a
user to ask the system questions about itself; and, since the system has a
partial self-description, it can answer many of these questions. More
importantly, the system uses its own description in carrying out each
operation. The upshot is that the user can affect MRS's behavior simply
by modifying this description. With a few simple statements, it is
possible to select a new data structure (e.g. property lists) or enable a
different inference algorithm (e.g. a general procedure like property
inheritance or backward chaining or a domain specific procedural
attachment).
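A minimal Python sketch of this idea follows: the system consults a description of itself before every operation, so changing the description changes the behaviour. Everything here is hypothetical (the System class, the describe, stash, and fetch methods, and the two store classes); it is meant only to show the mechanism, not the MRS vocabulary. Unlike a full system, the sketch does not migrate facts already stored when the description changes.

```python
class PropositionStore:
    """Default representation: a flat set of (relation, object, value) triples."""
    def __init__(self):
        self.triples = set()
    def put(self, rel, obj, val):
        self.triples.add((rel, obj, val))
    def get(self, rel, obj):
        return sorted(v for r, o, v in self.triples if r == rel and o == obj)

class PropertyListStore:
    """Alternative representation: one property list (dict) per object."""
    def __init__(self):
        self.plists = {}
    def put(self, rel, obj, val):
        self.plists.setdefault(obj, {}).setdefault(rel, []).append(val)
    def get(self, rel, obj):
        return sorted(self.plists.get(obj, {}).get(rel, []))

class System:
    def __init__(self):
        self.stores = {"proposition": PropositionStore(), "plist": PropertyListStore()}
        # The system's description of itself: which representation each relation uses.
        self.description = {}

    def representation_of(self, rel):
        return self.description.get(rel, "proposition")

    def describe(self, rel, representation):
        """A statement about the system itself; later operations obey it."""
        if representation not in self.stores:
            raise ValueError(f"unknown representation: {representation}")
        self.description[rel] = representation

    def stash(self, rel, obj, val):
        self.stores[self.representation_of(rel)].put(rel, obj, val)

    def fetch(self, rel, obj):
        return self.stores[self.representation_of(rel)].get(rel, obj)

system = System()
system.stash("Color", "Clyde", "grey")      # stored as a proposition
system.describe("Weight", "plist")          # meta-level statement
system.stash("Weight", "Clyde", "heavy")    # now stored on a property list
```

The caller uses the same stash and fetch operations throughout; only the self-description decides where the data lives.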
Furthermore, MRS is fully modifiable, i.e. it can be converted into any
LISP program by making assertions within its own formalism (hence the name
"Modifiable Representation System"). Most knowledge representation
systems were developed in the context of particular applications, e.g. KRL
and OWL in natural language and Units in molecular biology. The problem
is that what is good for one application is usually not perfect for
another. Although most systems offer their users a number of options,
there are often undesirable design decisions that cannot be changed. In
MRS modifiability is a design goal, which is realized by the system's
self-descriptive vocabulary and its handling of changes in its own
description.
Of course, bare Lisp is also fully modifiable. One important difference
is that MRS's language is more general than Lisp's, allowing its user to
assert arbitrary facts in the predicate calculus as well as defining
procedures. Furthermore, the transformation to other knowledge
representation systems is facilitated by the knowledge about knowledge
representation built into larger systems like RLL.
State of Implementation and Use
The core of MRS is fully implemented in DEC-20 Maclisp and Interlisp and
will soon be available in Franz Lisp on the Vax and Interlisp on the Xerox
Dolphin. A separate effort is underway to implement the system in IBM's
LISP370. MRS is being used in a variety of research projects at Stanford,
including the DART project (automated diagnosis of computer hardware
failures) and the Intelligent Agents project (smart interface for computer
operating systems). The system has been exported for trial use by the
Hewlett-Packard Co. and the Rand Corp.
In its initial state, MRS uses a "propositional" representation for
storing information. The representation is fully indexed and has a
flexible context mechanism. Before carrying out any operation, the system
uses depth-first backward chaining at the meta-level to figure out how to
carry out the operation. Initially, application-level deductions also use
depth-first backward chaining.
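The default depth-first regime, and the alternatives to it, can be pictured by treating pending work as an agenda whose discipline decides what is tried next. The following Python sketch is illustrative only: MRS is a Lisp system, and the function search, its parameters, and the toy graph and scores are hypothetical.

```python
import heapq
from collections import deque

def search(start, is_goal, successors, strategy="depth", score=None):
    """Visit nodes until a goal is found; return them in visiting order."""
    if strategy not in ("depth", "breadth", "best"):
        raise ValueError(f"unknown strategy: {strategy}")
    if strategy == "best" and score is None:
        raise ValueError("best-first search needs a score function")
    counter = 0  # tie-breaker so the heap never has to compare nodes
    agenda = [(score(start), counter, start)] if strategy == "best" else deque([start])
    seen = {start}
    visited = []
    while agenda:
        if strategy == "best":
            node = heapq.heappop(agenda)[2]   # lowest score first
        elif strategy == "breadth":
            node = agenda.popleft()           # oldest entry first
        else:
            node = agenda.pop()               # newest entry first
        visited.append(node)
        if is_goal(node):
            break
        for child in successors(node):
            if child in seen:
                continue
            seen.add(child)
            if strategy == "best":
                counter += 1
                heapq.heappush(agenda, (score(child), counter, child))
            else:
                agenda.append(child)
    return visited

# Toy search space (hypothetical): each node maps to its successors.
graph = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "D": [], "E": [], "F": []}
cost = {"A": 0, "B": 1, "C": 2, "D": 4, "E": 3, "F": 0}

depth_order = search("A", lambda n: n == "F", graph.__getitem__, "depth")
breadth_order = search("A", lambda n: n == "F", graph.__getitem__, "breadth")
best_order = search("A", lambda n: n == "F", graph.__getitem__, "best", cost.__getitem__)
```

The three strategies share one loop and differ only in which agenda entry is removed next, which is why a search discipline can be swapped without touching the rest of the system.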
In addition to the core system, several "plug-in" modules that
substantially increase the capability of MRS have been built. The agenda
module provides the system with the ability to do breadth-first and
best-first searches in addition to the default depth-first search. The
property list module allows the user to store facts in a frame-like
property list representation instead of the propositional representation.
The demon module allows the user to write general "if-needed" and
"if-removed" methods.
2.5 RLL: a language in which to build new representation tools
RLL is a tool to facilitate the building of expert programs -- and new
tools for building expert programs -- quickly. RLL is itself an expert
program, whose domain of expertise is knowledge representation. Last year
we mentioned it briefly as a new experimental vehicle; this year it has
come of age as a tool on the knowledge engineer's workbench.
The standard first step taken in building an application program in AI is
the design and implementation of a language in which to represent the
knowledge the program will use. Experience has shown that the language
developed in one application is seldom adapted for use in other programs
-- the features that were useful for the original problem become
limitations elsewhere. Thus, a specialized representation language is
redesigned and reimplemented for each application -- a very time consuming
task.
RLL (Representation Language Language) is designed to reduce the time
spent building such representation languages. It is a language which
"knows" about the components of representation languages in general. It
provides the user with an extendable collection of high level operators
and constructs which he can use to describe and build components of his
target language. After the user has specified the desirable features of
the target language, RLL integrates these components into a functional,
new representation language. In other words, a user can readily and
rapidly design a personalized language, exactly suited to the domain and
the application task at hand.
RLL is an open-ended language in which the user can add pieces of language
not provided in RLL's standard repertoire. Currently RLL can deal with
pieces such as slots, modes of inheritance, and specification of
functions. Slots conform to the definition given in the UNITS package
(see page ?). In fact, RLL borrowed much of its nomenclature, as well as
software, from UNITS. For example, it uses one of UNITS' main modes of
inheritance, the Example link. This inheritance relationship corresponds
to the set-theoretic notion of "element-of". However, there is nothing in UNITS
which corresponds to RLL's use of functional specification. This domain
is currently being explored, because it would help unify many outwardly
diverse concepts, such as processes, mechanisms, and slots.
The initial RLL system is itself a very versatile representation language.
For most tasks, the user can use it as he might any other representation
language. What distinguishes RLL is that the user is not forced to follow
the constraints imposed by a particular language; instead, he can mold his
copy of RLL to accommodate his particular task.
RLL derives this flexibility in two ways. First, RLL contains a large
library of largely independent, pre-fabricated "representational pieces".
For example, there are many (mutually incompatible) ways in which one can
associate facts with an object. One corresponds to UNITS' idea of a slot,
while another is the use of a pointer to a list of relevant assertions. For
example, consider how to represent "Fred is the Father of Mary". Using
slots, the "Father" slot of the unit Mary would be filled with the value
"Fred". Using links, the unit Mary would point to the assertion "(Father
Mary Fred)". The first method, using explicit slots, is "active" in the
initial RLL system. If this proves unsatisfactory, a simple command will
instruct RLL to switch to the second method. From then on (or, at least,
until the user's next altering command) the user-modified version of RLL
will prevail. Furthermore, RLL will automatically convert the user's
existing data into the new format.
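The two ways of attaching "Fred is the Father of Mary" to the unit Mary, and the automatic conversion between them, can be sketched in Python as follows. This is only an illustration under simplifying assumptions (one value per relation, two formats); the FactBase class and its method names are hypothetical and do not reflect RLL's actual commands.

```python
class FactBase:
    """Associates facts with units in one of two interchangeable formats."""

    def __init__(self):
        self.mode = "slots"
        self.units = {}  # unit name -> per-unit storage in the current format

    def add(self, relation, unit, value):
        if self.mode == "slots":
            # e.g. Mary: {"Father": "Fred"}
            self.units.setdefault(unit, {})[relation] = value
        else:
            # e.g. Mary: [("Father", "Mary", "Fred")]
            assertions = self.units.setdefault(unit, [])
            assertions[:] = [a for a in assertions if a[0] != relation]
            assertions.append((relation, unit, value))

    def get(self, relation, unit):
        stored = self.units.get(unit)
        if stored is None:
            return None
        if self.mode == "slots":
            return stored.get(relation)
        for rel, _, value in stored:
            if rel == relation:
                return value
        return None

    def switch_to(self, mode):
        """Change the active format and convert all existing data to it."""
        if mode not in ("slots", "links"):
            raise ValueError(f"unknown mode: {mode}")
        if mode == self.mode:
            return
        if mode == "links":
            self.units = {u: [(rel, u, val) for rel, val in slots.items()]
                          for u, slots in self.units.items()}
        else:
            self.units = {u: {rel: val for rel, _, val in assertions}
                          for u, assertions in self.units.items()}
        self.mode = mode

facts = FactBase()
facts.add("Father", "Mary", "Fred")   # stored in the Father slot of Mary
facts.switch_to("links")              # existing data is converted in place
```

After the switch, the same get call still answers "Fred", although the unit Mary now holds the assertion (Father Mary Fred) rather than a filled slot.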
To effectively use the variety of components, RLL must "understand" what
each does, and how. This information allows RLL to mesh diverse parts
together to form a coherent and workable whole. As such, it should be
possible for the user to design a fairly arbitrary language by simply
choosing the precise amalgamation of the pieces he needs, leaving to RLL
the responsibility of fitting them together.
Cataloging of possible components will never be totally complete. RLL's
second approach towards generality is its collection of tools designed to
help the user fabricate new parts. A set of high-level operators is
provided so that the user can define new components or types of
components, or refine existing ones.
For example, the user can define the "Parents" slot as the union of the
"Mother" and "Father" slots. That is, the value of the "Parents" slot of
an individual is a list consisting of the values of that person's "Mother"
and "Father" slots. This brief definition, Parents = Union(Mother Father),
is sufficient to tell RLL everything it needs to know about this slot.
RLL now knows to automatically invalidate the value stored for Fred's
parents if his mother remarries. Furthermore it knows that only some
units may have a Parents slot, i.e. those which have both Mother and
Father slots. (Hence while each person or zebra should have parents, we
should not expect a VLSI chip to have such a slot, nor the unit
representing the class of functions.)
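What a one-line definition such as Parents = Union(Mother Father) buys can be sketched in Python: from the definition alone, a system can compute the slot, cache it, invalidate the cache when a component slot changes, and decide which units may have the slot at all. The SlotSystem class, its method names, and the names Ann, Alice, and Bob are hypothetical; the sketch handles only one level of derived slots and is not RLL's mechanism.

```python
class SlotSystem:
    def __init__(self):
        self.values = {}    # (unit, slot) -> stored value
        self.defined = {}   # derived slot -> tuple of slots it is the union of
        self.cache = {}     # (unit, derived slot) -> cached list of values

    def define_union(self, slot, *parts):
        self.defined[slot] = parts

    def put(self, unit, slot, value):
        self.values[(unit, slot)] = value
        # Invalidate every cached derived value that depends on this slot.
        for derived, parts in self.defined.items():
            if slot in parts:
                self.cache.pop((unit, derived), None)

    def has_slot(self, unit, slot):
        if slot in self.defined:
            return all(self.has_slot(unit, p) for p in self.defined[slot])
        return (unit, slot) in self.values

    def get(self, unit, slot):
        if slot not in self.defined:
            return self.values[(unit, slot)]
        if not self.has_slot(unit, slot):
            raise KeyError(f"{unit} has no {slot} slot")
        key = (unit, slot)
        if key not in self.cache:
            self.cache[key] = [self.get(unit, p) for p in self.defined[slot]]
        return self.cache[key]

s = SlotSystem()
s.define_union("Parents", "Mother", "Father")
s.put("Fred", "Mother", "Ann")
s.put("Fred", "Father", "Bob")
before = s.get("Fred", "Parents")   # computed once, then cached
s.put("Fred", "Mother", "Alice")    # changing a component drops the cached value
after = s.get("Fred", "Parents")    # recomputed from the new Mother value
```

A unit with no Mother or Father slot, such as one standing for a VLSI chip, fails the has_slot check and so never acquires a Parents slot.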
The language parts, however they were derived, become the language the
user can use for his task. If he later discovers limitations in this set
he need only replace or redesign the offending components. RLL's
collection of types of parts, and its high-level operators, make
modifications relatively simple. As shown above, RLL then does the
"busywork", such as reformatting the existing data to conform to the set
of new conventions.
Although interest has been expressed in using RLL for a variety of
applications, the system has only recently become sufficiently stable to
permit others to use it. RLL's power can be seen in the variety of tasks
in which it has been employed: an adventure game simulation, an (internal)
exploration towards a more complete self-description of its various parts
using lower-level primitives, and a complex task in which RLL assumed the
role of expert-system builder, designing a program to deal with oil
spills.
3. Goals for the Coming Year
In the next year, we expect to increase the number and complexity of
applications of our tools, thereby providing more feedback to aid us in
improving them and designing new and better tools. We expect most of these
applications will be built almost exclusively by the interested experts
themselves, with less recourse to programmers and system designers than in
previous years. Some of our more experimental tools, such as MRS and RLL,
will be getting their first real trials this year.
AGE is planning, and has already begun, a large-scale test on a complex
application. A multi-facet display package is also on the drawing board.
Control options which admit of simultaneous backward and forward chaining
of rules are expected. AGE will also push toward a general standardizing
and simplifying of its underlying data structures this year.
Our primary effort in the continued development of MRS is being devoted to
the construction of useful new "plug-in" modules and a few extensions to
the inference capabilities. The issues of primary concern at this point
include an implementation of default reasoning and reasoning with
equality. At a more theoretical level, we are expanding the system's
capabilities to reason about its own structure and behavior.
Specifically, we would like to give the user the capability to specify
"invariants" that the system will automatically enforce. We would also
like to implement a module that can reason about the structure of the
knowledge in particular application areas in order to choose or design an
appropriate representation. Both MRS and RLL are expanding their
capabilities to reason about their own structure and behavior. RLL is
being used to build the Eurisko system, a program which gathers its own
empirical data, and tries to synthesize new heuristic rules. Each project
is also continuing this year's theme, the increasing facility with which
an outside expert can sit down and construct a knowledge-based expert
system.
BIBLIOGRAPHY
make sure to include these:
Genesereth, M. R. and Lenat, D. B., "A Modifiable Representation System",
Memo HPP-80-20, Stanford University Heuristic Programming Project, October, 1980.
Genesereth, M. R., Greiner, R., and Smith, D., "The MRS Manual",
Memo HPP-80-24, Stanford University Heuristic Programming Project, November, 1980.
Greiner, Russell and Lenat, Douglas B., "A Representation Language Language",
Proc. First AAAI Conference, Stanford, August, 1980.
Greiner, Russell, "RLL-1: A Representation Language Language",
Memo HPP-80-9, Stanford University, October, 1980.
Greiner, Russell and Lenat, Douglas B., "Details of RLL-1",
Memo HPP-80-23, Stanford University, October, 1980.
Also my mhpp memo about the nature of heuristics.